DETAILED ACTION

1. A request for continued examination under 37 CFR 1.114, including the fee set
forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this
application is eligible for continued examination under 37 CFR 1.114, and the fee set
forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action
has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 
December 14, 2009 has been entered. 

Claim Rejections - 35 USC § 103 

2. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

3. Claims 1, 2, 4 - 10, 12, 13, 16 - 20, 23, and 26 are rejected under 35 U.S.C.
103(a) as being unpatentable over Obrador (US 7,149,755 B2), hereinafter Obrador, in
view of Khan et al. (US 7,277,766), hereinafter Khan, and Liou et al. (US 6,278,446),
hereinafter Liou, and in further view of Csicsatka (US 2003/0158737), hereinafter
Csicsatka.

Claim 1: Obrador discloses a method for creating or accessing a menu for audio
content stored in a storage means, the content consisting of audio tracks, and the menu
containing representations of said audio tracks, the method comprising:

classifying ("organized") the audio tracks ("As used herein, the term "media
object" refers broadly to any form of digital content, including text, audio, graphics, 
animated graphics and full-motion video," Column 3 Lines 55 - 58 also "digital content 
may be compressed using a compression format that is selected based upon the digital 
content type (e.g., an MP3 or a WMA compression format for audio works," Column 4 
Lines 3-6) into groups or clusters (see "Browsing a Media Object Cluster Hierarchy," 
Column 9), wherein said classification is performed according to characteristic 
parameters of said audio tracks ("The metadata similarity may correspond to low-level 
features (e.g., motion activity, texture or color content, and audio content) or high-level 
features (e.g., meta data, such as keywords and names; objects, such as persons, 
places and structures; and time-related information, such as playback length and media 
object creation date). One or more known media object processing techniques (e.g., 
pattern recognition techniques, voice recognition techniques," Column 9 Lines 53 - 67); 

detecting addition of a new audio track ("As these collection grow in number and 
diversity, individuals and organizations increasingly will require systems and methods 
for organizing and browsing the digital content of their collections," Column 1 Lines 18 -
21, and therefore the system must detect new audio tracks in order to organize the
growing collection.); 

determining characteristic parameters of the new audio track ("metadata 
similarity criteria"). 

Obrador does not disclose wherein said characteristic parameters comprise 
physical features, perceptual features, and psychological features, wherein, physical
features comprise one or more of spectral centroid, short-time energy, or short-time
average zero-crossing, and wherein perceptual features comprise one or more of 
rhythm and tonality. Obrador does state with reference to browsing and organizing 
media, "In some embodiments, the relevance criteria used to select the media objects 
that will be presented contemporaneously with the selected media file may relate to a 
selected metadata similarity between media objects and the selected media file. The 
metadata similarity may correspond to low-level features (e.g., motion activity, texture or 
color content, and audio content) or high-level features (e.g., meta data, such as 
keywords and names; objects, such as persons, places and structures; and time-related 
information, such as playback length and media object creation date). One or more 
known media object processing techniques (e.g., pattern recognition techniques, voice 
recognition techniques, color histogram-based techniques, and automatic pan/zoom 
motion characterization processing techniques) may be used to compare media objects 
to the selected media file in accordance with the selected metadata similarity criterion," 
Column 9 Lines 48 - 67. Khan discloses a method and system for analyzing digital 
audio files. Khan teaches, "One advantage of the foregoing aspects of the present 
invention is that unique audio signatures may be assigned to audio files. Also various 
attributes may be tagged to audio files. The present invention can generate a 
customized playlist for a user based upon audio file content and the attached attributes. 
Hence making the music searching experience easy and customized," Column 3 Lines 
24 - 30. "Some of the features that can be associated with the audio files are: (a) 
Emotional quality vector values that indicates whether an audio file content is Intense,
Happy, Sad, Mellow, Romantic, Heartbreaking, Aggressive or Upbeat, (b) Vocal vector
values that indicates whether the audio file content includes a Sexy voice, a Smooth 
voice, a Powerful voice, a Great voice, or a Soulful voice, (c) Sound quality vector 
values that indicates whether the audio file content includes a strong beat, is simple, 
has a good groove, is fast, is speech like or emphasizes a melody, (d) Situational 
quality vector values that indicate whether the audio file content is good for a workout, a 
shopping mall, a dinner party, a dance party, slow dancing or studying, (e) Ensemble 
vector values indicating whether the audio file includes a female solo, male solo, female 
duet, male duet, mixed duet, female group, male group or instrumental, (f) Genre vector
values that indicate whether the audio file content belongs to a plurality of genres 
including Alternative, Blues, Country, Electronics/Dance, Folk, Gospel, Jazz, Latin, New
Age, Rhythm and Blues (R and B), Soul, Rap, Hip-Hop, Reggae, Rock and others, (g) 
Instrument vectors that indicates whether the audio file content includes an acoustic 
guitar, electric guitar, bass, drum, harmonica, organ, piano, synthesizer, horn or 
saxophone," Column 7 Lines 19-45. Khan continues, "As discussed in step S901, 
certain features or parameters are extracted from an audio file signal. The features of 
this methodology are based on Short Time Fourier Transform (STFT) analysis," Column 
8 Lines 56 - 60. The following STFT-based features may be extracted in step S901:
Spectral Centroid, Spectral Rolloff, Spectral Flux, Peak Ratio, Subband energy vector, 
Subband flux, and Subband Energy Ratios, Column 9 Lines 12-56. Therefore, since 
Obrador suggests using various analysis techniques to capture various features of 
audio content, it would have been obvious to one of ordinary skill in the art at the time of
the invention to use the well-known digital audio analysis techniques as disclosed by
Khan to capture the various features of the digital audio content, thereby realizing the 
aforementioned advantages. 
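
For illustration only, the three physical features named in the claim (spectral centroid, short-time energy, and short-time average zero-crossing) can be sketched as below. This is a minimal sketch using common textbook definitions; the function names, the frame-based interface, and the naive DFT are assumptions made for the example and are not taken from Khan, Obrador, or the application.

```python
import cmath
import math


def short_time_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, from a naive DFT of one frame."""
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        acc = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(acc))
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, report 0 by convention
    return sum(k * sample_rate / n * m for k, m in enumerate(magnitudes)) / total


# Hypothetical test signal: a 1000 Hz sine sampled at 8000 Hz, one 64-sample frame.
frame = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(64)]
features = (
    spectral_centroid(frame, 8000),
    short_time_energy(frame),
    zero_crossing_rate(frame),
)
```

A feature vector of this kind is what a distance-based clustering step would consume; a practical system would use an FFT library and overlapping windowed frames rather than the naive transform shown here.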

Obrador and Khan further disclose selecting automatically a first audio track as 
being a representative for the cluster, wherein the medoid of the cluster is selected ("For 
example, the media objects may be ordered in accordance with a selected context 
criterion, and the representative media object may correspond to the centroid or some 
other statistically-weighted average of a selected cluster of the ordered media objects," 
Obrador Column 10 Lines 35 - 39); 

automatically generating a reproducible audio extract from said representative 
audio track; and associating said audio extract as representative of said cluster to a 
menu list ("Media objects 98 may be indexed with logical links into the set of data 
structure sequences, as shown in Fig. 8A. Each data structures sequence link into a 
media file may be identify a starting point in the media file and the length of the 
corresponding sequence," Obrador Column 7 Lines 46 - 50 also "The media file and the 
media objects preferably are presented to the user through multimedia album page, 
which is a windows-based GUI that is displayed on a display monitor 42 (Fig. 2)," 
Obrador Column 8 Lines 3 - 7). 
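
As a purely illustrative aside, the difference between the claimed "medoid" and the "centroid" quoted from Obrador can be sketched as follows. The helper names and the choice of city block distance are assumptions made for the example; the data points 1, 9, and 11 are the ones that appear in Applicant's argument in this action.

```python
def city_block(a, b):
    """City block (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean; need not coincide with any member of the set."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def medoid(points):
    """Member of the set with the smallest total distance to all members."""
    return min(points, key=lambda c: sum(city_block(c, p) for p in points))


points = [(1,), (9,), (11,)]
print(centroid(points))  # (7.0,) which is not a member of the set
print(medoid(points))    # (9,) which is always a member of the set
```

Under these definitions the medoid corresponds to an actual track in the cluster, whereas the centroid is a computed average.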

While Obrador and Khan must be able to detect and determine characteristic
parameters of new audio tracks in order to organize and add to the growing collection,
Obrador and Khan do not disclose the specifics of clustering the newly added track; thus,
one of ordinary skill in the art at the time of the invention would be motivated to look



Application/Control Number: 10/541,577 Page 7

Art Unit: 2614 

elsewhere for such a teaching, since a teaching of how new tracks are added is
necessary for organizing a dynamic database as taught by Obrador and Khan. As a
result of the missing teachings, Obrador and Khan are silent as to the claimed limitations
regarding determining that dissimilarity between the newly added track and existing
clusters, according to said characteristic parameters used for classification, reaches at
least a defined minimum level; upon said determining, automatically creating a new,
second cluster; assigning the new audio track to said new, second cluster; upon said
creating the second cluster, classifying one or more further audio tracks of said audio
tracks into the second cluster.

Liou discloses a method for organization and browsing of media similar to
Obrador. Liou further teaches in the art of clustering with regards to organizing media
and in particular adding new media, "The preferred shot [in the case of Obrador and
Khan, the "shot" would refer to the "audio track"] grouping method is based on nearest
neighbor classification, combined with a threshold criterion. This method satisfies the
constraints discussed above, where no a priori knowledge or model is used. The initial
clusters are generated based on the color feature vector [in the case of Obrador and
Khan, the "color feature vector" would refer to the "audio feature vector"] of the shots
[audio tracks]. Each initial cluster is specified by a feature vector which is the mean of
the color feature vectors [audio feature vectors] of its members. When a new shot
[audio track] is available, the city block distance between its color feature vector [audio
feature vector] and the means or feature vectors of the existing clusters is computed.
The new shot [audio track] is grouped into the cluster with the minimum distance from
its feature vector, provided the minimum distance is less than a threshold. If an existing
cluster is found for the new shot [audio track], the mean (feature vector) of the cluster is
updated to include the feature vector of the new shot [audio track]. Otherwise, a new
cluster is created with the feature vector of the new shot [audio track] as its mean. The
threshold is selected based on the percentage of the image pixels [e.g., audio samples]
that need to match in color [audio feature], in order to call two images [audio tracks]
similar," Column 10 Lines 35 - 53 and Figure 11. Liou further states, "Other features
may also be used to produce the clusters, including audio similarity," Column 11 Lines 1
- 5.
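
The grouping procedure quoted from Liou above can be sketched as follows, for illustration only. This is a minimal sketch that follows the quoted steps (city block distance to each cluster mean, a threshold test, then either a mean update or a new cluster); the class name, function names, and the example vectors and threshold are assumptions and do not come from Liou.

```python
def city_block(a, b):
    """City block (L1) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


class Cluster:
    """A cluster described only by the running mean of its members."""

    def __init__(self, vector):
        self.mean = list(vector)
        self.count = 1

    def add(self, vector):
        # Incrementally update the mean to include the new member.
        self.count += 1
        self.mean = [m + (v - m) / self.count for m, v in zip(self.mean, vector)]


def assign(clusters, vector, threshold):
    """Place vector in the nearest cluster if its distance is below threshold.

    Otherwise create a new cluster with the vector as its mean.
    Returns the index of the cluster that received the vector.
    """
    best_index, best_dist = None, None
    for i, cluster in enumerate(clusters):
        d = city_block(cluster.mean, vector)
        if best_dist is None or d < best_dist:
            best_index, best_dist = i, d
    if best_index is not None and best_dist < threshold:
        clusters[best_index].add(vector)
        return best_index
    clusters.append(Cluster(vector))
    return len(clusters) - 1


# Hypothetical two-dimensional audio feature vectors and threshold.
clusters = []
for track in ([0.0, 0.0], [0.2, 0.2], [5.0, 5.0]):
    assign(clusters, track, threshold=1.0)
print(len(clusters))  # 2: the third track is too dissimilar and starts a new cluster
```

In this sketch the new cluster is created exactly when the minimum distance is not below the threshold, which is the condition the rejection maps to the claimed "defined minimum level" of dissimilarity.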

Therefore, as indicated above by the bracketed text added by the Examiner, it
would have been obvious to one of ordinary skill in the art of clustering at the time of
the invention to incorporate, in the invention of Obrador and Khan, the teachings of
Liou regarding the claimed determining that dissimilarity between the newly added
track and existing clusters, according to said characteristic parameters used for
classification, reaches at least a defined minimum level; upon said determining,
automatically creating a new cluster; assigning the new audio track to said new cluster;
and classifying the audio tracks into the groups or clusters, including the second cluster.
One of ordinary skill in the art would have been motivated to look elsewhere for such a
teaching in order to realize the invention, and Liou provides a teaching for adding new
audio tracks in a suitable manner in order to organize the growing collection of Obrador
and Khan, since, as stated by Liou, "Other features may also be used to produce the
clusters, including audio similarity," Column 11 Lines 1 - 5.

While Obrador, Khan, and Liou do not disclose a portable audio playback device
without display, Csicsatka discloses a portable audio player and teaches, "In this 
embodiment, the user interface provides menu driven selection, sorting, and playback of 
audio data files and display of elapsed playback time, volume level, and preset DSP 
mode. Additionally, prior to or after playback of an audio data file, the audio data player 
can play back the informational audio tag files associated with a selected audio data file 
in order to announce the selection. The audio tag file functions as part of a user 
interface, thereby allowing a user to select a particular audio data file, without having to 
view the LCD, by stepping through selections and listening to the announced 
information associated with audio data file," [0019]. Csicsatka goes on to explain,
"Alternatively, audio tag files may be played every time forward skip or reverse skip is 
selected by the user of a playback device. Alternatively, the player may include a 
designated key, or a keystroke sequence, that allows the user to call up and playback 
the audio tag information at anytime during playback of a selected audio file. Upon 
activation of such a key, the playback of the selected audio file is muted, paused, or the 
volume is lowered, while the audio tag information is played back, and upon completion 
of the tag information the playback of the selected audio file continues as before. Such 
a feature would advantageously allow the user to obtain the audio tag information 
during playback of a track when it is difficult or inconvenient to obtain the associated 
information by looking at a visual display," [0044]. Given the teachings of Csicsatka of 
using "audio tag file functions as part of a user interface, thereby allowing a user to 
select a particular audio data file, without having to view the LCD," it would have been
obvious to one of ordinary skill in the art that a similar method could be used in a device
without a display thereby providing a reduction in components thus decreasing the cost 
associated with having a display while also reducing the size of the device by the size 
needed for a display. Therefore, given the aforementioned advantages it would have 
been obvious to one of ordinary skill in the art at the time of the invention to incorporate 
the teachings of Csicsatka of using audio tags in a device without a display which has 
the audio capabilities as taught by Obrador, Khan, and Liou, thereby improving the
device of Obrador, Khan, and Liou.

Claim 2: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim 1,
wherein said characteristic parameters used for classification of audio content comprise 
one or more audio descriptors, the audio descriptors being either physical features, or 
perceptual features, or psychological or social features of the audio content ("The 
metadata similarity may correspond to low-level features (e.g., motion activity, texture or 
color content, and audio content) or high-level features (e.g., meta data, such as 
keywords and names; objects, such as persons, places and structures; and time-related 
information, such as playback length and media object creation date). One or more 
known media object processing techniques (e.g., pattern recognition techniques, voice 
recognition techniques," Obrador Column 9 Lines 53 - 67).

Claim 4: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim 1,
wherein the audio tracks within a cluster have variable order, so that the user listens to
a randomly selected track when having selected a cluster, with said track belonging to
said cluster (variable based on similarity, Obrador).

Claim 5: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim 1,
wherein a user can modify the result of automatic classification of audio tracks (e.g., by
choosing a different anchor, Obrador).

Claim 6: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim 1,
wherein a user can modify the classification rules for automatic classification of audio
tracks (e.g., by choosing a different anchor, Obrador).

Claim 7: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim 1,
wherein the actual audio data are clustered within said storage means according to said 
menu ("The media file and the media objects preferably are presented to the user 
through multimedia album page, which is a windows-based GUI that is displayed on a 
display monitor 42 (Fig. 2)," Obrador Column 8 Lines 3 - 7). 

Claim 8: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim 1,
wherein the audio extract is a sample from the audio track ("Media objects 98 may be
indexed with logical links into the set of data structure sequences, as shown in Fig. 8A.
Each data structures sequence link into a media file may be identify a starting point in
the media file and the length of the corresponding sequence," Obrador Column 7 Lines
46 - 50).

Claim 9: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim 1,
wherein audio extracts are created additionally for audio tracks not being 
representatives of clusters ("Media objects 98 may be indexed with logical links into the 
set of data structure sequences, as shown in Fig. 8A. Each data structures sequence 
link into a media file may be identify a starting point in the media file and the length of 
the corresponding sequence," Obrador Column 7 Lines 46 - 50). 

Claim 10: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim
1, wherein the length of audio extracts is not predetermined ("Media objects 98 may be
indexed with logical links into the set of data structure sequences, as shown in Fig. 8A.
Each data structures sequence link into a media file may be identify a starting point in
the media file and the length of the corresponding sequence," Obrador Column 7 Lines
46 - 50).

Claim 12: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim
1, wherein said menu is hierarchical, such that a cluster may contain one or more
subclusters (see "Browsing a Media Object Cluster Hierarchy," Obrador Column 9).

Claim 13: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim
1, wherein the classification rules are modified automatically if a defined precondition is
detected, and a reclassification may be performed (e.g., by choosing a different anchor,
Obrador).

Claim 20: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim
1, wherein said one or more further audio tracks classified into the second cluster were
previously classified in a different first cluster ("Delete member from cluster and place in
new cluster recomputed cluster mean," step L Figure 11 of Liou), the method further
comprising the steps of:

selecting automatically a second audio track being a representative for the first 
cluster, wherein the medoid of the new cluster is selected (Since "objects are grouped 
into clusters, each of which preferably contains a fixed number of media objects," there 
must be the creation of new clusters when the collection grows in number and diversity, 
and therefore the system selects a media object corresponding to "the centroid or some 
other statistically-weighted average of a selected cluster of the ordered media objects," 
Obrador Column 10 Lines 18 - 39);

automatically generating a reproducible new audio extract from the second audio 
track; and associating said new audio extract of the second audio track as 
representative of the first cluster to the menu list ("Media objects 98 may be indexed 
with logical links into the set of data structure sequences, as shown in Fig. 8A. Each 
data structures sequence link into a media file may be identify a starting point in the
media file and the length of the corresponding sequence," Obrador Column 7 Lines 46 -
50 also "The media file and the media objects preferably are presented to the user 
through multimedia album page, which is a windows-based GUI that is displayed on a 
display monitor 42 (Fig. 2)," Obrador Column 8 Lines 3 - 7). 

Claim 26: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim
20, wherein another audio track was a representative of the first cluster before the new
audio track was added, and said first audio track being representative of the first cluster
is different from the other audio track that was representative of the first cluster before
the new audio track was added ("For example, the media objects may be ordered in
accordance with a selected context criterion, and the representative media object may
correspond to the centroid or some other statistically-weighted average of a selected
cluster of the ordered media objects," Obrador Column 10 Lines 35 - 39, and "Update
Cluster Mean", "Recompute Cluster Mean", Liou Figure 11).

Claims 16 - 18 are substantially similar in scope to claim 1 and are also disclosed in
Figure 2, and therefore are rejected for the same reasons as claim 1 with the addition of
Figure 2.

Claim 19: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim
1, wherein the audio extract is an audio sequence being synthesized from the actual
audio track rather than being an original sample ("Media objects 98 may be indexed
with logical links into the set of data structure sequences, as shown in Fig. 8A. Each
data structures sequence link into a media file may be identify a starting point in the
media file and the length of the corresponding sequence. The data structure sequences
may be consecutive, as shown in FIG. 8B, or non-consecutive," Obrador Column 7
Lines 46 - 55, and therefore a non-original sample sequence).

Claim 23 is substantially similar in scope to claim 20 and therefore is rejected for the 
same reasons. 

4. Claims 3, 11, and 27 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Obrador, Khan, Liou, and Csicsatka in view of Piatt (US 6,987,221), hereinafter
Piatt.

Claim 3: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim 1,
but do not disclose whether or not an audio track can be classified into more than one
cluster. Piatt discloses a similar clustering technique for audio and, while not explicitly
stated, teaches that the tracks are placed in the playlist based upon the results of a vector
which is based upon multiple attributes of the item (Column 10 Lines 9 - 48). Therefore,
it would have been obvious to one of ordinary skill in the art that, when generating
multiple playlists as disclosed by Piatt, the system of Piatt may decide that a song
has the minimum required attributes necessary to match more than one playlist
category and therefore classify it in more than one playlist, since excluding songs
from being in more than one playlist would be disadvantageous to the user, who
wants the best matching songs in each playlist. Therefore, when applying a similar
technique in Obrador, Khan, Liou, and Csicsatka, it would have been obvious to one of
ordinary skill in the art at the time of the invention to generate clusters in a similar
manner.

Claim 27: Obrador, Khan, Liou, Csicsatka, and Piatt disclose the method according to
claim 3, wherein a track is classified into two clusters and both clusters contain a link to 
said track ("link", "pointer", Obrador Column 6 Lines 39 - 47) and wherein the track is 
stored only once ("all media files in a selected collection are stored only once in data 
base 96 (FIG. 7B)," Obrador Column 7 Lines 40 - 45). 

Claim 11: Obrador, Khan, Liou, and Csicsatka disclose the method according to claim
1, but do not disclose wherein one of said clusters has no representative track. Piatt
discloses a similar clustering technique for audio and, while not explicitly stated, teaches
how to determine the order among seed items when more than one seed item is
selected. Therefore, while one of ordinary skill in the art may consider any one of the
seed items in this case to be the representative track, it would also have been obvious
to one of ordinary skill in the art at the time of the invention that a representative track
does not exist since a determination cannot be made among seed items. Therefore,
when applying a similar technique in Obrador, Khan, Liou, and Csicsatka, it would have
been obvious to one of ordinary skill in the art at the time of the invention to generate
clusters in a similar manner.

5. Claims 14 and 15 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Obrador, Khan, Liou, and Csicsatka in view of Mercer et al. (US 7,043,477),
hereinafter Mercer.

Claims 14 and 15: Obrador, Khan, Liou, and Csicsatka disclose the method according
to claim 13, but do not disclose wherein said precondition comprises that the difference 
between the number of tracks in a cluster and the number of tracks in another cluster 
reaches a maximum limit value, and wherein said precondition comprises that all stored 
tracks were classified into one cluster, and the total number of tracks reaches a 
maximum limit value. Mercer discloses where bounds are set when determining the size 
of playlists (Column 8 Line 40 - Column 9 Line 62). Therefore, it would have been 
obvious to one of ordinary skill in the art given the teaching of Mercer to incorporate a 
limit between two playlists or a single sequence in the invention of Obrador, Khan, Liou,
and Csicsatka to determine how classification is performed, thereby allowing, for
example, "If composer information is available for some of the selected media files (e.g.,
if greater than twenty-five percent), the authoring software creates a menu 'Composer'
..." thereby further automating the classification process, Mercer Column 9 Lines 22 -
27.

6. Claim 24 is rejected under 35 U.S.C. 103(a) as being unpatentable over Obrador,
Khan, Liou, and Csicsatka in view of Robinson (US 7,072,846 B1), hereinafter
Robinson, with further support from Ferhatosmanoglu et al. (Approximate Nearest
Neighbor Searching in Multimedia Databases), hereinafter Ferhatosmanoglu.

Claim 24: Obrador, Khan, Liou, and Csicsatka disclose the apparatus according to
claim 16 but do not disclose wherein the means for assigning one or more of the audio
tracks of said first cluster to the new second cluster uses the K-means
algorithm to decide which audio tracks are assigned to the second cluster.
Robinson discloses a similar method and system for clustering songs and 
recommending the best song in the cluster to the user, Column 13 Lines 46 - 54. 
Robinson also teaches setting "the average number of songs desired per cluster," 
Column 4 Lines 65 - 67, similar to Obrador's teaching of fixing the number of media 
objects in a cluster. Robinson further explains, "As new songs are added to the system, 
new clusters are automatically created such that the average number of songs remains 
approximately the same; the optimization process then populates the cluster. These 
clusters, in various embodiments, may start out empty before they are optimized, or 
may be initially populated with new songs or randomly chosen songs," Column 4 Line 
67 - Column 5 Line 7. Robinson also teaches that a wide range of clustering
approaches fall within the scope of the invention and provides source code for the
standard k-means clustering concept as an example. To further support the technique
of Robinson, Ferhatosmanoglu teaches the "k-means algorithm [13] iteratively
constructs a number of clusters with a representative for each cluster such that the error
in representation is minimized," Page 506 Column 2. Ferhatosmanoglu, like Obrador and
Robinson, also teaches the clustering algorithm limits "the size of each cluster from both
above and below," Page 507 Column 1. Ferhatosmanoglu explains, "If the size goes
above the upper threshold, the cluster is split into two. If the size goes down below the 
lower threshold, then the cluster centroid is erased from the list of centroids. To split a 
cluster, we first duplicate the cluster centroid, and then perturb the exact copies 
randomly. It is known that K-means algorithm is sensitive to initialization. Since we have 
this splitting mechanism, instead of starting from cluster centroids chosen by some pre- 
processing scheme, we start by a single cluster, and the algorithm automatically creates 
new clusters until the population of each cluster is below the threshold. As we will 
demonstrate later, by having a lower threshold for cluster size, several queries can be 
answered by retrieving only a very small number of clusters. Also, by limiting the cluster 
sizes from above, we avoid extremely unbalanced distribution of data over the clusters. 
Although the minimum and maximum cluster sizes are not dominant factors in the 
performance of our technique, reasonable values need to be set for the design
purposes." Therefore, given the teachings of Robinson and Ferhatosmanoglu, it would
have been obvious to one of ordinary skill in the art at the time of the invention to use
the k-means algorithm as suggested by Robinson and further explained by
Ferhatosmanoglu with limits placed on the size of the clusters when adding new songs
to a collection in the invention of Obrador, Khan, Liou, and Csicsatka, thereby realizing
the aforementioned advantages while fixing the number of media objects in a cluster
that may be conveniently presented to a user at the same time (Obrador Column 10
Lines 23 - 27).
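
For illustration only, the splitting behavior quoted from Ferhatosmanoglu (start from a single cluster, duplicate and randomly perturb the centroid of any cluster above the upper size limit, then re-run k-means) can be sketched as below. This is a minimal sketch under stated assumptions: the function names, the squared Euclidean distance, the fixed iteration count, the jitter size, the seeded random generator, and the `max_rounds` safety cap are all hypothetical choices, and the lower size threshold mentioned in the quotation is omitted. It is not the source code provided by Robinson.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def assign(points, centroids):
    """Group each point with its nearest centroid."""
    groups = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        groups[nearest].append(p)
    return groups


def kmeans(points, centroids, iterations=20):
    """Standard k-means: alternate assignment and centroid update."""
    for _ in range(iterations):
        groups = assign(points, centroids)
        # An empty group keeps its previous centroid.
        centroids = [mean_vector(g) if g else c for g, c in zip(groups, centroids)]
    return centroids, assign(points, centroids)


def cluster_with_size_limit(points, max_size, seed=0, jitter=1e-3, max_rounds=50):
    """Start with one cluster and split any cluster larger than max_size."""
    rng = random.Random(seed)
    centroids, groups = [mean_vector(points)], [list(points)]
    for _ in range(max_rounds):  # cap guards against points that cannot be split
        if all(len(g) <= max_size for g in groups):
            break
        grown = []
        for c, g in zip(centroids, groups):
            grown.append(c)
            if len(g) > max_size:
                # Duplicate the centroid and perturb the copy slightly.
                grown.append(tuple(x + rng.uniform(-jitter, jitter) for x in c))
        centroids, groups = kmeans(points, grown)
    return centroids, groups


# Hypothetical one-dimensional feature values for six tracks.
tracks = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,)]
centroids, groups = cluster_with_size_limit(tracks, max_size=3)
```

With the example data the single initial cluster of six tracks exceeds the limit of three, so its centroid is duplicated and the re-run of k-means separates the two well-spaced groups.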

Allowable Subject Matter

7. Claims 28 and 29 are objected to as being dependent upon a rejected base 
claim, but would be allowable if rewritten in independent form including all of the 
limitations of the base claim and any intervening claims. 

Response to Arguments 

8. Applicant's arguments filed June 8, 2010 have been fully considered but they are 
not persuasive. Applicant argues, "Lieu puts all of its "shots" into clusters, creating new 
clusters whenever a given shot is sufficiently dissimilar to existing shots (block D). 
When no more shots exist, it moves onto the cited portion of the procedure, H-P. 
However, although block L places members into new clusters, such placement is not 
done upon the creation of a new cluster. In fact, it is entirely possible for Liou's 
procedure to use one, and only one, cluster, never creating a new cluster at all. For 
example, if the shots all fall within the given color threshold, the procedure will never 
reach block F and no new clusters will be created. Even if clusters were to be created, 
there is no logical connection between such creation (block E) and the eventual 
movement of members between clusters (block L). Bear in mind that the 
claim recites "upon said creating the second cluster." This language creates a causal 
relationship between the creation of the second cluster and the classification of further 
audio tracks into said second cluster. Liou fails to disclose or suggest such a causal 
relationship". The Examiner respectfully disagrees with this argument because the claim 
language, like block D of Liou, also calls for "determining that dissimilarity between the 
newly added track and existing clusters, according to said characteristic parameters 
used for classification, reaches at least a defined minimum level" and therefore 
Applicant's claim also would not reach the steps "automatically creating a new, second 
cluster; assigning the new audio track to said new, second cluster; upon said creating 
the second cluster" because these steps are only performed "upon said determining". 
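
The conditional nature of the quoted steps can be sketched as follows. This is an illustrative sketch only: the function name, the use of Euclidean distance to a cluster mean as the dissimilarity measure, and the feature tuples are assumptions made for the example and are not taken from the claims or from Liou.

```python
def add_track(clusters, track, min_dissimilarity):
    """Place a track in the nearest cluster, or create a new cluster when the
    dissimilarity to every existing cluster reaches min_dissimilarity.

    clusters is a list of lists of feature tuples and is modified in place.
    Returns the index of the cluster that received the track.
    """
    def dissimilarity(t, cluster):
        # Assumed measure: Euclidean distance to the cluster mean.
        mean = tuple(sum(c) / len(cluster) for c in zip(*cluster))
        return sum((a - b) ** 2 for a, b in zip(t, mean)) ** 0.5

    if clusters:
        dists = [dissimilarity(track, c) for c in clusters]
        best = min(range(len(clusters)), key=dists.__getitem__)
        if dists[best] < min_dissimilarity:
            clusters[best].append(track)
            return best
    # Reached only when the threshold test above is met (or no cluster exists).
    clusters.append([track])
    return len(clusters) - 1

library = []
for features in [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0)]:
    add_track(library, features, min_dissimilarity=1.0)
```

If every added track stays below the threshold, the final branch is never taken after the first track and no second cluster is created, in this sketch just as in the scenario Applicant describes for Liou.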
9. Applicant also argues, "It is respectfully pointed out to the Examiner that medoids 
are distinct from centroids. As those having ordinary skill in the art will recognize, a 
medoid will always be a member of the data set it refers to, whereas a centroid has no 
such requirement. In this respect, they are analogous to the concepts of "mean" and 
"median." For example, taking the data points {1,9, and 11}, the centroid will be 7, 
whereas the medoid will be 9. Citing the use of a centroid therefore does not read on 
the present claims, as a centroid will not necessarily be a medoid. Neither Khan nor 
Liou make any reference to medoids and cannot cure the deficiencies of Obrador in this 
respect. It is therefore respectfully asserted that Obrador, Khan, and/or Liou, taken 
alone or in combination, fail to disclose or suggest selecting the medoid audio track as 
being a representative of the second cluster". In response to applicant's argument that 
the references fail to show certain features of applicant's invention, it is noted that 
Obrador discloses, "the representative media object may correspond to the centroid or 
some other statistically-weighted average of a selected cluster of the ordered media 
objects" (Obrador Column 10 Lines 35 - 39). Therefore, since Applicant admits in the 
argument that a centroid is analogous to "mean" and a medoid is analogous to "median", 
and both "mean" and "median" are very well known averages, Obrador's 
statement likewise encompasses the use of the medoid, since a medoid is some other 
statistically-weighted average of a selected cluster of the ordered media objects. 
At the very least, it would have been obvious to one of ordinary skill in the art at the time of the 
invention to use a medoid given Obrador's statement. 
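
Applicant's numerical example can be checked with a short sketch. The helper name `medoid` and the use of absolute difference as the distance are assumptions made for the illustration; the data points are those given in Applicant's argument.

```python
from statistics import mean, median

def medoid(points):
    # The member of the data set with the smallest total distance to all members.
    return min(points, key=lambda p: sum(abs(p - q) for q in points))

data = [1, 9, 11]
centroid = mean(data)       # arithmetic mean of the data points
representative = medoid(data)
middle = median(data)
```

For these data the centroid is 7, which is not a member of the set, while the medoid and the median are both 9, which is a member.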

10. Further Applicant argues, "Liou teaches not to use K-means clustering," however 
the specific teaching of Liou at the column and lines indicated by Applicant states, "The 
number of potential clusters a priori is not known, so known K-means clustering and 
other strategies using this a priori information are also not useful". Therefore Liou does 
not teach that one would never want to use K-means clustering, only that the 
K-means clustering techniques known to Liou at the time are not useful. Again it is noted 
that this limitation regarding K-means does not appear in the independent claims but in 
claim 24, for which the Examiner relied on Robinson and Ferhatosmanoglu for teaching a 
K-means algorithm that addresses the original concerns of Liou. Specifically, Liou states, 
"The choice of clustering strategy is limited by having no a priori knowledge of the 
number of clusters or assumptions about the nature of the clusters," however in the 
case of Ferhatosmanoglu who uses a modified K-means clustering strategy, the 
strategy is not limited by having no a priori knowledge of the number of clusters or 
assumptions about the nature of the clusters. Specifically, Ferhatosmanoglu states, "It is 
known that K-means algorithm is sensitive to initialization. Since we have this splitting 
mechanism, instead of starting from cluster centroids chosen by some pre-processing 
scheme, we start by a single cluster, and the algorithm automatically creates new 
clusters until the population of each cluster is below the threshold," page 507 column 1 
of Ferhatosmanoglu. Therefore Ferhatosmanoglu's modified K-means is not limited by 
having no a priori knowledge of the clusters, since Ferhatosmanoglu uses a splitting 
scheme instead of a pre-processing scheme requiring a priori knowledge. 



Conclusion 

11. Applicant's amendment necessitated the new ground(s) of rejection presented in 
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP 
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 
CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Joseph Saunders whose telephone number is (571) 
270-1063. The examiner can normally be reached on Monday - Thursday, 9:00 a.m. - 
4:00 p.m., EST. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Vivian Chin, can be reached on (571) 272-7848. The fax phone number for 
the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/J. SV 

Examiner, Art Unit 2614 



/Xu Mei/ 

Primary Examiner, Art Unit 2614 



